Amendments to the Claims: This listing of claims will replace all prior versions, and listings,
of claims in the application.

Listing of Claims: 

1. (Currently Amended) A method of speaker normalization comprising the steps of:

segmenting an input speech utterance into frames of a constant time length and 
extracting an acoustic feature parameter of each of the frames; 

for each of the frames, frequency-converting the respective acoustic feature parameter 
by filtering with a plurality of predetermined frequency conversion coefficients to form a 
corresponding plurality of frequency-converted feature parameters; 

determining, for each of the frames, a plurality of similarities or distances between each 
of the frequency-converted feature parameters and a standard phonemic model, the standard
phonemic model being a group of phonemes;

selecting at least one of the plurality of predetermined frequency conversion coefficients, 
representing a frequency converting condition for normalizing the input utterance, by using the 
determined plurality of similarities or distances for each of the frames; and 

normalizing the input utterance by frequency-converting the input utterance using the 
selected at least one predetermined frequency conversion coefficient. 

2. (Previously Presented) A method according to claim 1, wherein the step of 
selecting at least one of the predetermined frequency conversion coefficients includes a step of 
mutually comparing between the determined plurality of similarities or distances included in an 
input frame constituted by the frame, a step of selecting for each frame a maximum likelihood
combination of a phoneme and at least one of the plurality of predetermined frequency 
conversion coefficients by using a result of the comparison, and a step of cumulating the 
frequency of the frequency conversion coefficient in a maximum likelihood over plural frames 
and selecting at least one of the plurality of predetermined frequency conversion coefficients 
having a highest frequency as the frequency converting condition. 

3. (Previously Presented) A method according to claim 1, wherein the step of
selecting at least one of the predetermined frequency conversion coefficients includes a step of 
mutually comparing between the determined plurality of similarities or distances included in an 
input frame constituted by the frame, a step of selecting a set of a phoneme of the standard 
phonemic model and at least one of the plurality of predetermined frequency conversion 
coefficients that provides a result of maximum likelihood, and a step of selecting at least one of 
the plurality of predetermined frequency conversion coefficients as the frequency converting 
condition of the frame. 

4. (Previously Presented) A method according to claim 1, wherein the step of 
determining the plurality of similarities or distances further includes a step of determining, for 
each frame, a ratio in similarity or distance of the phoneme as a weight by using the acoustic 
feature parameter of the frame and the standard phonemic model, the step of selecting at least 
one of the plurality of predetermined frequency conversion coefficients including a step to select 
the frequency converting condition by using the weight. 

5. (Previously Presented) A method according to claim 4, wherein the step of 
determining the ratio in similarity or distance of the phoneme as the weight includes a step of 
selecting for each frame at least one of the plurality of predetermined frequency conversion 
coefficients in a maximum likelihood with respect to all the phonemes of the standard phonemic 
model, a step of deciding a phoneme-based frequency converting condition for all the 
phonemes, on all the phonemes of the standard phonemic model, from a result of cumulating 
phoneme by phoneme the frequency converting condition in the maximum likelihood over 
plural frames, and a step of using the phoneme-based frequency converting condition and the 
similarity or distance, to decide for each frame the weight for the phoneme-based frequency 
converting condition, wherein the step of selecting at least one of the plurality of predetermined 
frequency conversion coefficients selects the frequency converting condition for the frame by 
using the weight on the phoneme-based frequency converting condition. 

6. (Previously Presented) A method according to claim 1, wherein, said step of 
selecting at least one of the plurality of predetermined frequency conversion coefficients 
employs at least vowels in determining the plurality of similarities or distances. 

7. (Previously Presented) A method according to claim 1, wherein, said step of
selecting at least one of the plurality of predetermined frequency conversion coefficients 
employs only vowels in determining the plurality of similarities or distances. 

8. (Currently Amended) An apparatus for speech recognition comprising: 

a feature parameter extracting section for segmenting an input speech utterance into 
frames of a constant time length and extracting an acoustic feature parameter of each frame; 

a frequency converting section for, for each frame, frequency-converting the respective 
acoustic feature parameter by filtering with a plurality of predetermined frequency conversion 
coefficients to form a corresponding plurality of frequency-converted feature parameters; 

a similarity or distance computing section for determining, for each of the frames,
a plurality of similarities or distances between each of the frequency-converted feature
parameters and a standard phonemic model, the standard phonemic model being a group of
phonemes;

a frequency converting condition deciding section for selecting at least one of the 
plurality of predetermined frequency conversion coefficients, representing a frequency 
converting condition for normalizing the input utterance, by using the determined plurality of 
similarities or distances for each of the frames; and 

a speech-recognition processing section for recognizing a speech by using the input 
utterance and a subject-of-recognition acoustic model, 

wherein the input utterance is normalized by frequency-converting the input utterance 
using the selected at least one predetermined frequency conversion coefficient thereby 
effecting speech recognition. 

9. (Previously Presented) An apparatus according to claim 8, wherein the frequency 
converting condition deciding section mutually compares between the determined plurality of 
similarities or distances included in an input frame constituted by the frame, selects for each 
frame a maximum likelihood of combination of a phoneme and at least one of the plurality of 
predetermined frequency conversion coefficients by using a result of the comparison, and
cumulates the frequency of the frequency conversion coefficients in the maximum likelihood 
over the plural frames and selects at least one of the plurality of predetermined frequency 
conversion coefficients highest in frequency as the frequency converting condition. 

10. (Previously Presented) An apparatus according to claim 8, wherein the frequency 
converting condition deciding section mutually compares between the determined plurality of 
similarities or distances included in an input frame constituted by the input frame, selects a
combination of a phoneme of the standard phonemic model and at least one of the plurality of 
predetermined frequency conversion coefficients that provides a result of maximum likelihood, 
and selects at least one of the plurality of predetermined frequency conversion coefficients as 
the frequency converting condition of the frame. 

11. (Previously Presented) An apparatus according to claim 8, wherein the similarity 
or distance computing section computes, for each frame, a ratio in similarity or distance of the 
phoneme as a weight by using the acoustic feature parameter of the frame and the standard 
phonemic model, the frequency converting condition deciding section selecting the frequency 
converting condition by using the weight. 

12. (Previously Presented) An apparatus according to claim 11, wherein the 
similarity or distance computing section selects for each frame at least one of the plurality of 
predetermined frequency conversion coefficients in a maximum likelihood with respect to all the 
phonemes of the standard phonemic model, decides a phoneme-based frequency converting 
condition for all the phonemes, on all the phonemes of the standard phonemic model from a 
result of cumulating phoneme by phoneme the frequency converting condition in a maximum 
likelihood over plural frames, and uses the phoneme-based frequency converting condition and 
the similarity or distance, to decide the weight for the phoneme-based frequency converting 
condition for each frame, wherein the frequency converting condition deciding section selects 
the frequency converting condition for the frame by using the weight on the phoneme-based 
frequency converting condition. 

13. (Previously Presented) An apparatus according to claim 8, wherein said 
frequency converting condition deciding section employs at least vowels in determining the 
plurality of similarities or distances. 

14. (Previously Presented) An apparatus according to claim 8, wherein said
frequency converting condition deciding section employs only vowels in determining the 
plurality of similarities or distances. 

15. (Original) An apparatus according to claim 8, comprising a frequency converting 
condition process display section for displaying, for a user, intermediate data obtained by an 
internal process of the frequency converting condition deciding section. 
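
For illustration only, and not as part of the claims, the selection recited in claims 1 and 2 can be sketched as follows. The sketch assumes each frame's acoustic feature parameter is a short spectral envelope and that the standard phonemic model holds one template envelope per phoneme. The function names (warp, sq_distance, select_coefficient, normalize) are hypothetical, and the linear-interpolation warp and squared Euclidean distance are stand-ins that the claims do not specify.

```python
from collections import Counter


def warp(spectrum, alpha):
    """Frequency-convert an envelope: output bin i reads input position i * alpha.

    Assumption: linear interpolation, with positions clamped to the last bin.
    """
    n = len(spectrum)
    out = []
    for i in range(n):
        pos = min(i * alpha, n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append((1 - frac) * spectrum[lo] + frac * spectrum[hi])
    return out


def sq_distance(a, b):
    """Squared Euclidean distance between two equal-length envelopes."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_coefficient(frames, alphas, phoneme_models):
    """Pick the candidate coefficient that wins the most frames.

    For each frame, every (coefficient, phoneme) pair is scored by distance,
    the closest pair is taken as the best match for that frame, and the
    winning coefficient receives one vote. The most frequent winner is returned.
    """
    votes = Counter()
    candidates = [(a, name) for a in alphas for name in phoneme_models]
    for frame in frames:
        best_alpha, _ = min(
            candidates,
            key=lambda c: sq_distance(warp(frame, c[0]), phoneme_models[c[1]]),
        )
        votes[best_alpha] += 1
    return votes.most_common(1)[0][0]


def normalize(frames, alpha):
    """Apply the selected coefficient to every frame of the utterance."""
    return [warp(frame, alpha) for frame in frames]
```

Calling select_coefficient with the frames of an utterance, a list of candidate coefficients, and a dictionary of phoneme templates returns one coefficient, which normalize then applies to the whole utterance. Counting per-frame winners corresponds to the cumulation over plural frames in claim 2; a per-frame choice as in claim 3 would skip the vote and use each frame's own winner.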